Multi-category active learning algorithm based on multiple clustering algorithms and multivariate linear regression
WANG Min, WU Yubo, MIN Fan
Journal of Computer Applications    2020, 40 (12): 3437-3444.   DOI: 10.11772/j.issn.1001-9081.2020060921
Traditional lithology identification methods have low recognition accuracy and are difficult to integrate organically with geological experience. To address this problem, a multi-category Active Learning algorithm based on multiple Clustering algorithms and multivariate Linear regression (ALCL) was proposed. Firstly, multiple heterogeneous clustering algorithms were run to obtain one category matrix per algorithm, and the category matrices were labeled and pre-classified by querying common points. Secondly, the key examples used to train the weight coefficient model of the clustering algorithms were selected through the proposed priority largest search strategy and most confusing query strategy. Thirdly, the objective solving function was defined, and the weight coefficients of the clustering algorithms were obtained by training on the key examples. Finally, the classification calculation was performed with the weight coefficients, and the samples whose results had high confidence were classified. Experiments were carried out on six public lithology datasets of oil wells in Daqing oilfield. Experimental results show that, at its highest, the classification accuracy of ALCL is 2.07%-14.01% higher than those of the traditional supervised learning algorithms and other active learning algorithms. The results of hypothesis testing and significance analysis show that ALCL has a better classification effect in lithology identification.
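The final step, classifying only the samples whose combined result is confident enough, can be illustrated with a minimal sketch. The abstract does not give the formula, so everything here is an assumption: the function name `weighted_vote`, the normalized weighted vote used as the confidence score, and the `threshold` parameter are hypothetical, and the weights are taken as already learned rather than fitted by the regression step.

```python
from collections import defaultdict


def weighted_vote(predictions, weights, threshold=0.6):
    """Combine the labels proposed by several clustering algorithms for one sample.

    predictions: one class label per clustering algorithm (already mapped to classes).
    weights:     one positive weight per algorithm (assumed to be learned elsewhere).
    threshold:   hypothetical confidence cut-off; not taken from the paper.

    Returns (label, confidence); label is None when confidence is below threshold,
    meaning the sample is left for a later round instead of being classified now.
    """
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight

    total = sum(scores.values())
    best = max(scores, key=scores.get)
    confidence = scores[best] / total  # share of total weight backing the winner
    return (best if confidence >= threshold else None), confidence
```

With three algorithms weighted 0.5, 0.3 and 0.2, a sample labeled `"sand"` by the first two and `"shale"` by the third is classified as `"sand"`, whereas a sample on which all three disagree falls below the cut-off and stays unclassified.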
Cost-sensitive active learning through farthest distance sum sampling
REN Jie, MIN Fan, WANG Min
Journal of Computer Applications    2019, 39 (9): 2499-2504.   DOI: 10.11772/j.issn.1001-9081.2019020763

Active learning aims to reduce expert labeling through human-machine interaction, while cost-sensitive active learning focuses on balancing labeling and misclassification costs. Based on the Three-Way Decision (3WD) methodology and the Label Uniform Distribution (LUD) model, a Cost-sensitive Active learning through the Farthest distance sum Sampling (CAFS) algorithm was proposed. Firstly, the farthest total distance sampling strategy was designed to query the labels of representative samples. Secondly, the LUD model and the cost function were used to calculate the expected sampling number. Finally, the k-Means algorithm was employed to split the blocks in which different labels were obtained. In CAFS, the 3WD methodology was adopted in the iterative process of label query, instance prediction, and block splitting, until all instances were processed. The learning process was controlled by the cost minimization objective. Results on 9 public datasets show that CAFS has a lower average cost than 11 mainstream algorithms.
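One plausible reading of the farthest total distance sampling strategy is sketched below: among the unlabeled candidates in a block, query the one whose summed Euclidean distance to the already-queried instances is largest. This is an assumption based only on the strategy's name; the function `farthest_distance_sum` and its arguments are hypothetical and are not taken from the paper.

```python
import math


def farthest_distance_sum(points, queried, candidates):
    """Pick the candidate index with the largest total distance to queried points.

    points:     list of coordinate tuples, one per instance.
    queried:    indices of instances whose labels were already queried (non-empty).
    candidates: indices of unlabeled instances that may be queried next.
    """
    def total_distance(i):
        # Sum of Euclidean distances from candidate i to every queried instance.
        return sum(math.dist(points[i], points[j]) for j in queried)

    return max(candidates, key=total_distance)


points = [(0, 0), (1, 1), (5, 0), (2, 3)]
first = farthest_distance_sum(points, queried=[0], candidates=[1, 2, 3])
second = farthest_distance_sum(points, queried=[0, first], candidates=[1, 3])
```

Each new query is pushed away from the instances already labeled, which spreads the queried samples across the block. Ties are broken by candidate order, a detail this sketch does not try to match to the original algorithm.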
